Tutorial on algebraic deletion correction codes
The deletion channel is known to be a notoriously difficult channel to design
error-correction codes for. In spite of this difficulty, there are some
beautiful code constructions which give some intuition about the channel and
about what good deletion codes look like. In this tutorial we will take a look
at some of them. This document is a transcript of my talk at the coding theory
reading group on some interesting works on the deletion channel. It is not intended
to be an exhaustive survey of work on the deletion channel, but rather a tutorial
on some of the important and cute ideas in this area. For a comprehensive
survey, we refer the reader to the cited sources and surveys.
We also provide an implementation of VT codes that correct single
insertion/deletion errors for general alphabets at
https://github.com/shubhamchandak94/VT_codes/
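To make the construction concrete, here is a minimal sketch of decoding a single deletion in a binary VT code; it is not the linked implementation (which also handles general alphabets and insertions), just the textbook idea. A codeword of length n satisfies sum_i i*x_i ≡ a (mod n+1), and the checksum deficiency of the received word pins down the deleted bit and where to reinsert it.

```python
def vt_syndrome(bits, n):
    """VT checksum: sum of i * x_i over 1-indexed positions, mod (n + 1)."""
    return sum(i * b for i, b in enumerate(bits, start=1)) % (n + 1)

def vt_correct_deletion(received, n, a=0):
    """Recover the VT_a(n) codeword from a word with exactly one deletion.

    Compare the checksum deficiency d with the weight w of the received word:
    if d <= w the deleted bit is 0 (with d ones to its right); otherwise it is
    1 (with d - w - 1 zeros to its left)."""
    assert len(received) == n - 1
    w = sum(received)
    d = (a - vt_syndrome(received, n)) % (n + 1)
    if d <= w:
        # Reinsert a 0 so that exactly d ones lie to its right.
        p, ones_right = len(received), 0
        while ones_right < d:
            p -= 1
            ones_right += received[p]
        return received[:p] + [0] + received[p:]
    # Reinsert a 1 so that exactly d - w - 1 zeros lie to its left.
    p, zeros_left = 0, 0
    while zeros_left < d - w - 1:
        zeros_left += 1 - received[p]
        p += 1
    return received[:p] + [1] + received[p:]

# Example: 1001 lies in VT_0(4); deleting its second bit still decodes correctly.
assert vt_correct_deletion([1, 0, 1], n=4, a=0) == [1, 0, 0, 1]
```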
Neural Joint Source-Channel Coding
For reliable transmission across a noisy communication channel, classical
results from information theory show that it is asymptotically optimal to
separate out the source and channel coding processes. However, this
decomposition can fall short in the finite bit-length regime, as it requires
non-trivial tuning of hand-crafted codes and assumes infinite computational
power for decoding. In this work, we propose to jointly learn the encoding and
decoding processes using a new discrete variational autoencoder model. By
adding noise to the latent codes to simulate the channel during training, we
learn to both compress and error-correct given a fixed bit-length and
computational budget. We obtain codes that are not only competitive against
several separation schemes, but also learn useful robust representations of the
data for downstream tasks such as classification. Finally, inference
amortization yields an extremely fast neural decoder, almost an order of
magnitude faster than standard decoding methods based on iterative belief
propagation.
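The central training trick, corrupting the discrete latent code with simulated channel noise, can be sketched in a few lines of PyTorch. This is only an illustration of the idea under assumed names and shapes, not the authors' model.

```python
import torch

def bsc(latent_bits, flip_prob=0.1):
    """Simulate a binary symmetric channel on a {0., 1.}-valued float tensor:
    each bit is flipped independently with probability flip_prob."""
    flips = (torch.rand_like(latent_bits) < flip_prob).float()
    return (latent_bits + flips) % 2.0  # XOR with the random flip pattern

# During training the decoder sees bsc(encoder(x)) instead of encoder(x), so the
# learned code must be robust to the channel as well as compact.
```

In practice a straight-through or REINFORCE-style gradient estimator is also needed to backpropagate through the discrete latent code.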
Minimax redundancy for Markov chains with large state space
For any Markov source, there exist universal codes whose normalized
codelength approaches the Shannon limit asymptotically as the number of samples
goes to infinity. This paper investigates how fast the gap between the
normalized codelength of the "best" universal compressor and the Shannon limit
(i.e. the compression redundancy) vanishes non-asymptotically in terms of the
alphabet size and mixing time of the Markov source. We show that, for Markov
sources whose relaxation time is at least $1 + \frac{2+c}{\sqrt{k}}$, where $k$
is the state space size (and $c > 0$ is a constant), the phase transition for
the number of samples required to achieve vanishing compression redundancy is
precisely $\Theta(k^2)$.
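For readers unfamiliar with the term, the quantity being bounded is the worst-case per-symbol minimax redundancy; the standard definition is sketched below, though the paper's exact normalization may differ.

```latex
% Per-symbol minimax redundancy over the family M_k of Markov sources on k states:
% the best universal code Q is measured against the worst-case source P.
\[
  \bar{R}_n \;=\; \min_{Q} \; \max_{P \in \mathcal{M}_k} \;
  \frac{1}{n}\Bigl( \mathbb{E}_P\bigl[-\log Q(X_1^n)\bigr] - H_P(X_1^n) \Bigr),
\]
% Vanishing redundancy means \bar{R}_n -> 0 as the number of samples n grows.
```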
DZip: improved general-purpose lossless compression based on novel neural network modeling
We consider lossless compression based on statistical data modeling followed
by prediction-based encoding, where an accurate statistical model for the input
data leads to substantial improvements in compression. We propose DZip, a
general-purpose compressor for sequential data that exploits the well-known
modeling capabilities of neural networks (NNs) for prediction, followed by
arithmetic coding. DZip uses a novel hybrid architecture based on adaptive and
semi-adaptive training. Unlike most NN-based compressors, DZip does not require
additional training data and is not restricted to specific data types, only
needing the alphabet size of the input data. The proposed compressor
outperforms general-purpose compressors such as Gzip (on average 26% reduction)
on a variety of real datasets, achieves near-optimal compression on synthetic
datasets, and performs close to specialized compressors for large sequence
lengths, without any human input. The main limitation of DZip in its current
implementation is the encoding/decoding time, which limits its practicality.
Nevertheless, the results showcase the potential of developing improved
general-purpose compressors based on neural networks and hybrid modeling.
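The prediction-plus-arithmetic-coding pipeline can be made concrete with a toy calculation: for any predictive model, the achievable compressed size is essentially the model's cumulative log loss in bits. This is a generic sketch of that principle, not DZip's actual architecture.

```python
import math

def ideal_code_length_bits(symbols, predict):
    """Sum of -log2 p(symbol | context) under a predictive model `predict`
    (a callable mapping the list of past symbols to a dict of probabilities).
    An arithmetic coder driven by the same model attains this length to within
    a couple of bits."""
    total, context = 0.0, []
    for s in symbols:
        total += -math.log2(predict(context)[s])
        context.append(s)
    return total

# Toy order-0 model over a binary alphabet: a fixed biased coin.
coin = lambda context: {0: 0.9, 1: 0.1}
data = [0] * 90 + [1] * 10
print(ideal_code_length_bits(data, coin))  # ~47 bits instead of 100
```

The better the predictor matches the data, the shorter the code, which is why stronger neural models translate directly into compression gains.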
IncSQL: Training Incremental Text-to-SQL Parsers with Non-Deterministic Oracles
We present a sequence-to-action parsing approach for the natural language to
SQL task that incrementally fills the slots of a SQL query with feasible
actions from a pre-defined inventory. To account for the fact that typically
there are multiple correct SQL queries with the same or very similar semantics,
we draw inspiration from syntactic parsing techniques and propose to train our
sequence-to-action models with non-deterministic oracles. We evaluate our
models on the WikiSQL dataset and achieve an execution accuracy of 83.7% on the
test set, a 2.1% absolute improvement over the models trained with traditional
static oracles assuming a single correct target SQL query. When further
combined with the execution-guided decoding strategy, our model sets a new
state-of-the-art performance at an execution accuracy of 87.1%.
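To illustrate the difference from a static oracle, here is a toy non-deterministic oracle (the action strings and helper name are hypothetical): rather than forcing a single gold action sequence, it accepts any next action consistent with at least one of the equivalent gold derivations, and training can then follow the model's preferred choice within that set.

```python
def nondeterministic_oracle(gold_sequences, prefix):
    """Return every next action that keeps `prefix` consistent with
    at least one gold action sequence (there may be several)."""
    prefix = tuple(prefix)
    return {seq[len(prefix)]
            for seq in gold_sequences
            if len(seq) > len(prefix) and tuple(seq[:len(prefix)]) == prefix}

# Two semantically equivalent derivations that differ only in condition order.
gold = [("SELECT name", "WHERE age>30", "AND city='SF'"),
        ("SELECT name", "WHERE city='SF'", "AND age>30")]
print(nondeterministic_oracle(gold, ["SELECT name"]))
# {"WHERE age>30", "WHERE city='SF'"} -- either continuation is accepted
```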
Optimal Communication Rates and Combinatorial Properties for Common Randomness Generation
We study common randomness generation problems where players aim to generate
the same sequences of random coin flips; some subsets of the players share an
independent common coin that can be tossed multiple times, and there
share an independent common coin which can be tossed multiple times, and there
is a publicly seen blackboard through which the players communicate with each
other. We provide a tight representation of the optimal communication rates via
linear programming, and more importantly, propose explicit algorithms for the
optimal distributed simulation for a wide class of hypergraphs. In particular,
the optimal communication rate in complete hypergraphs is still achievable in
sparser hypergraphs containing a path-connected cycle-free cluster of
topologically connected components. Some key steps in analyzing the upper
bounds rely on two different definitions of connectivity in hypergraphs, which
may be of independent interest.
Towards improved lossy image compression: Human image reconstruction with public-domain images
Lossy image compression has been studied extensively in the context of
typical loss functions such as RMSE, MS-SSIM, etc. However, compression at low
bitrates generally produces unsatisfying results. Furthermore, the availability
of massive public image datasets appears to have hardly been exploited in image
compression. Here, we present a paradigm for eliciting human image
reconstruction in order to perform lossy image compression. In this paradigm,
one human describes images to a second human, whose task is to reconstruct the
target image using publicly available images and text instructions. The
resulting reconstructions are then evaluated by human raters on the Amazon
Mechanical Turk platform and compared to reconstructions obtained using
the state-of-the-art compressor WebP. Our results suggest that prioritizing
semantic visual elements may be key to achieving significant improvements in
image compression, and that our paradigm can be used to develop a more
human-centric loss function.
The images, results and additional data are available at
https://compression.stanford.edu/human-compression
LFZip: Lossy compression of multivariate floating-point time series data via improved prediction
Time series data compression is emerging as an important problem with the
growth in IoT devices and sensors. Due to the presence of noise in these
datasets, lossy compression can often provide significant compression gains
without impacting the performance of downstream applications. In this work, we
propose an error-bounded lossy compressor, LFZip, for multivariate
floating-point time series data that provides guaranteed reconstruction up to
user-specified maximum absolute error. The compressor is based on the
prediction-quantization-entropy coder framework and benefits from improved
prediction using linear models and neural networks. We evaluate the compressor
on several time series datasets where it outperforms the existing
state-of-the-art error-bounded lossy compressors. The code and data are
available at https://github.com/shubhamchandak94/LFZip
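The prediction-quantization-entropy-coder framework can be sketched in a few lines. This is a simplified illustration with a trivial predictor, not the LFZip code itself, but it shows where the guaranteed maximum absolute error comes from.

```python
import numpy as np

def error_bounded_encode(x, predict, max_error):
    """Predict each sample from past *reconstructed* samples, then quantize the
    residual to a multiple of 2*max_error, so |x[t] - recon[t]| <= max_error.
    In a real compressor the integer indices q would then be entropy coded."""
    step = 2.0 * max_error
    recon = np.zeros(len(x))
    q = np.zeros(len(x), dtype=np.int64)
    for t in range(len(x)):
        pred = predict(recon[:t])
        q[t] = int(np.round((x[t] - pred) / step))
        recon[t] = pred + q[t] * step
    return q, recon

# Trivial "previous value" predictor; LFZip itself uses NLMS filters and NNs.
prev = lambda past: past[-1] if len(past) else 0.0
x = np.sin(np.linspace(0, 8 * np.pi, 1000))
q, recon = error_bounded_encode(x, prev, max_error=0.01)
assert np.max(np.abs(x - recon)) <= 0.01
```

Because the decoder repeats the same predictions from the same reconstructed history, transmitting only the quantized residual indices suffices for exact reproduction of `recon`.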
Reducing latency and bandwidth for video streaming using keypoint extraction and digital puppetry
COVID-19 has made video communication one of the most important modes of
information exchange. While extensive research has been conducted on the
optimization of the video streaming pipeline, in particular the development of
novel video codecs, further improvements in video quality and latency are
required, especially under poor network conditions. This paper proposes an
alternative to the conventional codec through the implementation of a
keypoint-centric encoder relying on the transmission of keypoint information
from within a video feed. The decoder uses the streamed keypoints to generate a
reconstruction preserving the semantic features in the input feed. Focusing on
video calling applications, we detect and transmit the body pose and face mesh
information over the network, which is displayed at the receiver in the
form of animated puppets. Using efficient pose and face mesh detection in
conjunction with skeleton-based animation, we demonstrate a prototype requiring
less than 35 kbps of bandwidth, an order of magnitude reduction over typical
video calling systems. The added computational latency due to the mesh
extraction and animation is below 120ms on a standard laptop, showcasing the
potential of this framework for real-time applications. The code for this work
is available at https://github.com/shubhamchandak94/digital-puppetry/.
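A back-of-envelope calculation makes the order-of-magnitude claim plausible. Every count below is an illustrative assumption rather than a figure from the paper; the prototype's sub-35 kbps figure additionally relies on compressing the keypoint stream.

```python
# Rough upper bound on the raw (uncompressed) keypoint bitrate.
# All numbers here are assumptions for illustration only.
pose_points = 33          # e.g., a full-body skeleton
face_points = 468         # e.g., a dense face mesh
coords_per_point = 2      # x, y screen coordinates
bits_per_coord = 10       # coordinates quantized to 10 bits
fps = 15                  # transmitted keypoint frames per second

bits_per_frame = (pose_points + face_points) * coords_per_point * bits_per_coord
print(bits_per_frame * fps / 1000, "kbps raw")  # ~150 kbps before any coding
# Even uncompressed, this is far below the multi-Mbps of pixel video; delta and
# entropy coding of the keypoint stream shrinks it further.
```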
Robust Text-to-SQL Generation with Execution-Guided Decoding
We consider the problem of neural semantic parsing, which translates natural
language questions into executable SQL queries. We introduce a new mechanism,
execution guidance, to leverage the semantics of SQL. It detects and excludes
faulty programs during the decoding procedure by conditioning on the execution
of the partially generated program. The mechanism can be used with any
autoregressive generative model, which we demonstrate on four state-of-the-art
recurrent or template-based semantic parsing models. We demonstrate that
execution guidance universally improves model performance on various
text-to-SQL datasets with different scales and query complexity: WikiSQL, ATIS,
and GeoQuery. As a result, we achieve new state-of-the-art execution accuracy
of 83.8% on WikiSQL.
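The filtering idea behind execution guidance can be approximated in a few lines: at decoding stages where the partial program forms a runnable query, candidates that the database engine rejects are pruned from the beam. This is a hedged toy sketch using sqlite3, not the paper's implementation.

```python
import sqlite3

def runs_without_error(sql, db_path):
    """Return True if the (partial or complete) query executes on the database."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(sql)
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()

def prune_beam(candidates, db_path):
    """Execution guidance in miniature: drop beam candidates whose SQL already
    fails to execute, keeping decoding focused on runnable programs."""
    kept = [sql for sql in candidates if runs_without_error(sql, db_path)]
    return kept or candidates  # fall back to the original beam if all fail
```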